The properties of quasispecies dynamics in 
molecular evolution 

V. V. Gafiychuk *and A. K. Prykarpatsky* 
February 2, 2008 

Abstract 

We consider the general properties of the quasispecies dynamical sys- 
tem from the standpoint of its evolution and stability. Vector field analy- 
sis as well as spectral properties of such system has been studied. Mathe- 
matical modelling of the system under consideration has been performed, 
keywords: quasispecies dynamics, Hamiltonian systems, gradient dy- 
namical system, biological evolution, complex system. 

1 Introduction 

This article is devoted to the theoretical study of a self-organization problem 
of an ensemble of interacting species and to developing a model of a naturally 
fitted coevolving ecosystem. It is well known since Eigen's work on replicating 
molecules PP that the quasispecies approach is very fruitful for modeling the 
fundamental behavior of evolution (See, for example 0] |SJ El[7j). Despite 
a huge amount of papers devoted to this problem biological evolution is too 
complex that we are still far from understanding real biological processes of 
self-organization. The matter is that real experiments and obtained data on 
the evolution of primitive systems need a comprehensive theoretical description 
that would allow to explain these data and put them into a proper context. In 
this case the central place is put to the investigation of intrinsic properties of a 
nonlinear system which describe the system evolution. We put our attention to 
Eigen's approach and will establish some new but very important mathematical 
properties of the system which could be useful for modeling many co-evolving 
ecosystem. In part we will use an approach devised in [S] for describing similar 
systems of evolution. 
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We start our analysis of qusispecies dynamics in molecular evolution with 
discussion of background of modeling aspects which appear to be very im- 
portant for further. The first principle of modeling such types of dynamics 
we accept here is based on probability consideration of molecular replicator 
processes and which are well described by resulting quasispecies concentration 
vector x £ [0, 1]™, where n £ Z + is a number of species, being normalized by 

n 

the full probability condition x j — 1- The latter condition is fundamental 

ensuring the full molecular number concentration and will play, in what will 
follow, an important role. Moreover, the set of such vectors x £ E n in the 
Euclidian space E n forms the algebraic symplicial submanifold S n -i C E n on 
which in reality the studied dynamics holds. For it to be described analytically, 
it is naturally to consider a representative symmetric matrix P eEnd_E n , such 
that P = / (g) /, </,/>= 1, for same vector / £ E n modeling the resulting 
replicator dynamics simultaneously ensuring two constraints: life on the sim- 
plex S n — i and conservation of the initial molecular system information during 
its replicator evolution. In general, such a dynamics can be represented in the 
following form: 

P(t) = U(t)P(0)V- 1 (t), 

with evolution parameter t £ 1, for some invertible mappings V, U : R — > 
GL(E n ), where P(0) £End_E n is an initial molecular dynamics state. Below 
we consider the symmetric replicator dynamics, whose the inverse replicator 
process matrix V £ GL(E n ) has to to coincide with the forward replicator 
process matrix U £ GL{E n ), that is the equality U = V holds. As a result, 
our dynamics is representable as 

P(t) = U(t)P(0)U-\t) (1) 

for all moments of time t £ R. Assuming smooth dependence of on t e R, 
one easily derives that the following Lax type dynamics 

dP/dt = [A, P] (2) 

holds, where by definition, the matrix A := dU/dtU^ 1 . For the matrix P €End_E™ 
to conserve its symmetricity, the matrix A £EndE n must be evidently skew- 
symmetric, that is A = — A* in E n . 

Return now back to analyzing the intrinsic structure of our matrix P eEndE'™, 
modeling replicator dynamics under regard. From the general form (J2J one sees 
that our dynamics possesses a priori so called trace-invariants, namely, all quan- 
tities SpP m are such for any m £ Z + , where Sp :End.E n — » R is the standard 
matrix trace operator. This fact may be naturally used within our modeling 
approach. Really, consider a representative vector / £ E n in the following 

n 

form: / = {^Jx~l £ R : i = l,n}. Then, the condition < /, / >= x j = 1 i s 

i=i 

n 

satisfied due to the equality SpP — ^ Xj and the fact that the latter quantity 

3=1 
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is conserved! Since P 2 = P, all of the other invariants are the same, introducing 
into the dynamics no additional constraint. Thus, we have naturally imbedded 
our molecular replicator dynamics initially in the space of concentration vec- 
tors x £ S n -i C E n into the matrix phase space V^P of symmetric projector 
mappings of co-dimension one, that is P = P* and dim(ImP) = 1. The latter 
phase space V is still called a Grassmann manifold possessing a lot of impor- 
tant intrinsic mathematical properties, which we shell use further for deeper 
analyzing our molecular replicator dynamics under regard. 

Consider an evolution equation modeling our molecule replicator dynamics 
in the general matrix Lax type form where A(x) = {Ajk : j,k = l,n} is 
a certain matrix depending on variables x = {xi € R+ : i = l,n} and P — 
{^fxjxk : j, k = 1, n}. In this case as the matrix P symmetric, it is evident that 
the matrix A must be skew-symmetric. Below we will put the main attention 
to processes similar to qusispecies dynamics considered in |2J. Since we 
will consider x € [0, 1]™ as a concentration vector of quasispecies, it has to be 
nonnegative for all time, so the system Q now is defined on the nonnegative 
orthant K™ R™ : x, > 0} . 

Let us now suppose a system evolves due to the Eigen positive feedback as- 
sociated with the terms corresponding to the increase of concentrations Xj , j = 
1, n : 

n 

dxj/dt = ^ a jkXkRk(x-) ~ Fj(x), (3) 
fe=l 

where an element ajk expresses the probability that a molecule k copies into a 
molecule j and Fj (x) denotes the corresponding inverse sink term. In turn the 
element ajj gives the probability that a molecule j replicates faithfully, Ru (x) 
is the fitness of the molecules of k type and characterizes their replication rate. 

We have to determine the sink term Fj(x),j = l,n , in order to fulfill the 
governing condition on the dynamic determined on the (n — 1) dimensional 
simplex 

= jxeR;:f> 4 = lj. (4) 

In order to get the corresponding source term in the equation © let us 
determine elements A-jk,j, k = l,n of the skew symmetric matrix A in as 



. 1 

A;* - ^ 



ajk\Jxk/xjRk{x) - a,kj\Jxj/xkRj(x.) 



(5) 



It is easy to observe here that there exists the matrix A = {^ajk\/xk/xjRk(x) : 
j, k = 1, n}, such that A = A — A*. Substituting expression (JSJ into we get 

n 

that Fj(x) = XjRj{x) ^ o-kjtj = l,n. As a result the governing equation for 

k=l 

the quasispecies dynamics takes the form: 
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n n 

dxj/dt = ^ajkXkRktx) - ^ ^XjX k ak 3 \J Xj/x k Rj(x) 

k=l k=l 

n n 

= X! a jk x kRk (x) - XjRj (x) a>kj 

k=l k=l 
n 

= y^(ajfeXfc.Rfc(x) - XjRj(x))a k j (6) 
It is easy to see after summing up equations (JBJ that 

n 

dxj/dt = 0, 

meaning in this case that the evolution dynamics is determined really on the 
simplex Q. 

It should be mentioned that the system © is a little bit diverse from systems 
considered in a set of papers |2 El El El El E] mainly by sink terms. But namely 
only such a form of the sink term ensures the important simplex condition 

n 

^2xj = 1 for all t € R, without additional constraints involved in the model. 

j 

Note here that, the system © is really representable in the evolution form 
|QJ that can be checked easily, and the matrix P eEndR™ is a one dimensional 
symmetric projector, that is P 2 = P, P* = P for t 6 1, being important 
for our further studying the structure of vector field Q on the corresponding 
projector matrix manifold V |l(Jllllj . The component vector form of the system 
(El can be also represented as 

dx/dt = AR(x)x-BR(x)x. (7) 

with A = {a,jk ' j,k = 1, n}, B = diag < ^2 a kj '■ j — 1, n \ , and -R(x), x e5„_i 

U=i J 

being a fitness matrix expression. If the fitness matrix i?(x), x €.S n -i is 
diagonal one that is R = R = diag {Rj : j = l,n} and does not depend on 
x £S n -i, the system Q) evidently will be linear. Thereby the solution of such 
a system can be obviously represented as 

x(t) = x(0) exp[(A - B)R]t, 

where x(0)€S n -i is a concentration of population of each type at initial time. 

2 Vector field analysis: imbedding into gradient 
structure 

Since our system dynamic flow m reality lives on the projector matrix of 
Grassmann manifold V3P all its properties can be naturally extracted 
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from deeper analysis of its structure on this manifold. In particular, it is 
important to know where the vector field © admits the structure of gradient 
type endowed with some Lyapunov function, ensuring the existence of stable 
states on the compact Grassmann manifold V . In order to study the gradient 
field structure of the flow if any on the projector matrix manifold P9P let 
us consider a smooth functional $ : V — >R, whose usual variation is given as 

m(P) := Sp(ZMP) (8) 

with a symmetric matrix D eEndP/ 1 and SpiEndi?"^!! 1 being as before the 
standard matrix trace. Taking into account the natural metrics on P, we 
consider the projection V^M* of the usual gradient vector field upon the 
tangent space T(P) under the following conditions: 

<p(X; P) := Sp(P 2 - P, X) = 0, Sp(Vy>, V v *) \ v = 0, (9) 

holding on V for all X eEndlR™. The first condition is evidently equivalent to 
P 2 — P = 0, that is P € V . Thereby we can formulate such a Statement. 

Statement 1. The functional gradient V^f (i 5 ), P € V under the condi- 
tion JJjJ admits the following commutator Lax type representation: 

V^*(P) = [A,P] 

with A eendP™ being a skew-symmetric matrix satisfying the commutator 
equation 

A=[D,P], 

where D is a symmetric matrix. Really, consider the projection of the usual 
gradient V4 r (P) upon the tangent space T(V) of the Grassmann manifold 
V with P e V imbedded into EndP": 

V V *(P) =V *(P)-V^(Q;P), (10) 

where Q eEndP™ is some still unknown matrix. Taking into account the 
conditions ©, we find that 

V V *(P) = D-Q-P{D-Q)-{D-Q)P + PD + DP 

= PD + DP + 2PQP, (11) 

where we made use of the relationships 

V V *(P) = D-Q + PQ + QP) 

and 

P(D -Q) + (D- Q)P + 2PQP = D-Q. 
Now one can easily see from Ijlljl and the second condition in (|l()|l . that 

PQP = -PDP (12) 
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for all P e P, giving rise to the final result 

Vtp^(P) = PD + DP — 2PDP, (13) 

coinciding exactly with the commutator [[P,P],P]. Since the matrix admits 
the representation 

A = A - A* = DP - PD 

one gets right away that 

A = DP + S (14) 

with S = S* eEndP™. If there exists such symmetric matrices D and S GEndP" 
that ffUl is satisfied, then our model will be evidently of gradient type. In par- 
ticular, the matrix D eEndP™ found from (|14J) must enjoy the Volterra criteria 
D'(P) = D'*{P) for any P G V imbedded into EndP". 

It should be noted here that the Grassmann manifold V is also a sym- 
plectic manifold [101 II 1 1 whose canonical symplectic structure is given by the 
expression: 

lj {2) (P) := Sp(PrfP A dPP), (15) 

where dui^(P) = for all P e V, and the 2-differential form (|15|l is non- 
degenerate ^31 El upon the tangent space T(P). 

Let us assume now that £ : V — >R is an arbitrary smooth function on V . 
Then the Hamiltonian vector field X^ : V — *T(V) on V generated by this 
function subject to the symplectic structure (|T5)l is given as follows: 

X e = [[D s ,P},P], (16) 

where D^ GEndP™ is a certain symmetric matrix. The vector field : 
V —*T(P) generates on the compact manifold V the flow 

dP/dt = Xe(P), (17) 

being defined globally for all t G M. This flow by construction is evidently 
compatible with the projector condition P 2 = P. This means in particular 
that the condition 

-X £ + PX^ + X^P = (18) 

holds on P. Thus, we stated that dynamical system (0 being considered on the 
Grassmann manifold V can be Hamiltonian that makes it possible to formulate 
the following statement. 

Statement 2. A gradient vector field of the form (|16fl on the Grassmann 
manifold V is Hamiltonian with respect to the canonical symplectic structure 
HI 5(1 and a certain Hamiltonian function £ : V CendP"— >R, satisfying condi- 
tions 
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V£(P) = [D e ,P]- Z + PZ + ZP, 
Vf'(P) = V£'*(P), D e =P 



for some matrix Z GEndi?" for all P 6 P CP™. 

Consider now the (n— l)-dimensional simplex iS n _i as a Riemannian space 
Af™ -1 = SVi-iwith the metrics 

n 

ds 2 (x) := d 2 "^ | p (x) = gij(x)dxidxj \-p, 
where for z,j = = f>i = 1. 

* J i=l 

Relative to the metrics on M™ -1 we can calculate the gradient VgVf of the 
function W : P — >M and set on M™ -1 the gradient vector field 

dx/dt = V a *(£c), (19) 

n 

with xEMg~ x , that is the condition J^Xi = 1 is satisfied a priory. Having 

i=l 

calculated l|19|) . we can formulate the next statement. 

Statement 3. The gradient vector fields V^W on P and V ff ^ on M™ -1 
are equivalent, or in another words, vector fields 

dx/dt = V g ^(x) (20) 

and 

dP(x)/dt = [[D(x),P(x)] , P(x)} (21) 

generate the same flow on M^ 1 . 

As a result from the Hamiltonian property of the vector field V^Vt on the 
Grassmann manifold P we get such a new statement. 

Statement 4. The gradient vector field V g ^ (|19(1 on the metric space 
M™ _1 where n = 2m + 1 is Hamiltonian subject to the non-degenerate sym- 
plectic structure 

uf\x):=^\P)\ Mr , (22) 

for all x£M 2m with a Hamiltonian function £^ : M 2m — > R, where := £ | M ™-i 
, £ : P — >K is the Hamiltonian function of the vector field X,t on P. 

Otherwise, if n G Z + is arbitrary our two flows (|20|l and l|21|l are on P only 
Poissonian. 
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3 Spectral properties 



Consider the eigenvalue problem for a matrix PeP, depending on the evolu- 
tion parameter (el: 

P{t)f = A/, (23) 

where / £ M™ is an cigenfunction, A £ 1 is a real eigenvalue since P* = P, 
i.e. matrix P € V is symmetric. It is seen from the expression P 2 = P that 
specP(i) = {0, 1} for all ( G 1. Moreover, taking into account the invariance 
of SpP = 1 we can conclude that only one eigenvalue of the projector matrix 
P(t), t £ R, is equal to 1, all others being equal to zero. 

In general case the image ImP C K™ of the matrix P(t) £ V for alH £ M is 
fc— dimensional, k — rankP, and the kernel KerP C R™ is (n — fc)-dimensional, 
where k £ Z + is constant, not depending on ( e M. As a consequence we 
establish that at k = 1 there exists a unique vector /o £ R™/(KerP) for which 

Pfo = fo, (24) 

Due to the statement above for a projector P eEndP™ we can write down 
the following expansion in the direct sum of mutually orthogonal subspaces: 
E n =KerP©ImP. 

Take now fo £ M. n satisfying the condition 1241) . Then in accordance with 
H2Q(I the next statement holds. 

Statement 5. The vector fo £ E n satisfies the following evolution equa- 
tion: 

df /dt = [D(x),P(x)] f + C (t)/ , (25) 

where Co ■ M^K is a certain function depending on the choice of the vector 
fo SlmP. At some value of the vector fo SlmP we can evidently ensure the 
condition Co = for all t £ K.. Moreover one easily observes that for the 
matrix P(t) £ V one has the representation P(t) = fo ® fo, (fo, fo) =1, 
giving rise to the system (JSJ) if fo ■= {yfxj £ K+ : 3 ' = 1, n} £ E n . 

4 Discussion of the quasispecies dynamics 

Quasispecies dynamics is very interesting for researchers and there are a lot of 
papers devoted to computer simulation of it E] . In order to single 
out the characteristic features of the model stated above let us write down the 
model typically used for quasispecies dynamics. In vector form such a model 
can be written as 

dyi/dt = (AR- < R >)x (26) 

where R — diag {Rj : j = l,n} is the diagonal matrix with the Malthusian 
fitness values, < R >.= J27=i ^ iXi - The diagonal elements ajj, j = l,n, of 
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Figure 1: Quasispecies dynamics as a results of computer simulation of 
the equation l|2()l) - a and our model iJBJ - b. Initial conditions: x = 
(0.2; 0.45; 0.1; 0.25) T 

the matrix A — {a,jk : j, k = 1, n}. In correspond to the self- replication process 
and nondiagonal - to mutation. In order to fulfill the simplex condition we 

n 

have to put for each column djj = 1 — akj, j — l,n and hence the 

;.• i 

column sum will give rise to unity. In contrast to i|26|) in our model I© we do 
not require such a constraint in order to satisfy the simplex condition. 

The mutation matrix A in our model has to be nondiagonal. The choice 
of matrix A specifies the generation and recombination rates among different 
molecules in chemistry. A very interesting application of the model stated in 
the papers 01 E] is for describing quasispecies evolution. In framework of 
this approach the bio molecules are considered as a bitstrings of length L. In 
this case we have 2 L different molecules. As a result the mutation matrix A is 
of huge dimension. If we consider a bio molecules with L >> 1 this approach is 
practically out of reason and can not be treated for large bio macromolecule. In 
this case some simplification has been considered when certain macromolecules 
are grouped together in such a way that the number of independent variables 
is reduced to L + 1 [H] ■ In the framework of this approach a certain sequence 
(master sequence) is chosen beforehand and other sequences are grouped all 
into error classes, according to their Hamming distance from the chosen one. 
Sequences which have the same Hamming distance from the master one compile 
a one error class. Such reduction of the dimensionality make it possible to get 
the problem acceptable for computer simulation and to reveal some features 
inherent to real bio world. In this case we can write down the nondiagonal 
elements ajk of mutation matrix A as the probability of mutation string k to 
string j 

a ]k = q L - H ^{\-q) H ^ (27) 

where q is the probability that a particular locus of the chain is copied cor- 
rectly into the next generation i.e. the probability of replication, (1 — q) is 
the probability of mutation), L is a bit string length Hjk is Hamming distance 
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between string j and k and is denned as the number of positions, in which the 
two sequences differ. In order to note the difference between two model we 
simulate our model for simplest case L = 3. The plot of computer simulation 
with matrix given by formulae (|27|l is presented on Fig.l (a, b). We can easily 
see that having the same initial conditions and matrix replication mutation 
matrix A the results of the evolution process is quite different. 

Let us find a stationary frequency distribution in the framework of derived 
model iJBJ. If we make set the right hand side of the equation @) to zero we 
get for constant fitnesses the next expression 

n 

a jk x k Rk 

*i = k ^— n ■ (28) 

Rj a kj 

Analyzing the expression (|28|l one can see that the values of the frequencies 
do not depend on diagonal elements a,jj, j = l,n, and replicator process 
in this model is determined only by the input given by nondiagonal elements 
dfcj, k 7^ j = 1, n, being natural from the biological point of view. If the fitness 
value of some species Rj is much greater than all of the rest fittnesses, the values 
of Xj will be chosen less than all the others. In this case the results obtained 
for example for a single peak model landscape in the framework of l|2()|) , which 
lead to so called "phase transition" and vanishing of the corresponding species 
named there master ones if djj < 1/Rj be interpreted in another way [21 El E] 
The equation (|28l) can be evidently written down as a fixed point problem 

n 

x = Ax, where A = {cijkRk/ {Rj Y a kj) '■ k, j — l,n} with diagonal elements 

a,jj := 0, j = 1, n. So, its solution exists if the matrix A possesses the eigenvalue 
A = 1, or the determinant equation det(l — A) = 0, where 1 is unity matrix, 
is satisfied identically. But this fact is true for any matrix A and arbitrary 
parameters Rj £ = l,n. If this equation is not satisfied, the process 

of reducing some amount of species from the system happens, that is some of 
frequencies will become exactly zero and the resulting system remains to live on 
a simplex of lower dimension. The latter situation can be considered as a "phase 
reduction" naturally related with some threshold values of frequencies found 
from the determinant equation written above, against the notion of "phase 
transition" used in cited above articles. This behavior is of great interest 
for diverse applications since it can be interpreted as some kind of simplex 
reduction 5„_i — > 5„_; for 1 < I < n, taking place in some kinds of replicator 
dynamics models. Taking into account this aspect of these models and their 
virtual importance in studying biological replicator and other models, we plan 
to dwell on their deeper studying soon in another place. 
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